Hierarchical structure and word strength prediction of Mandarin prosody

Authors

  • Greg Kochanski
  • Chilin Shih
  • Hongyan Jing
Abstract

We use Stem-ML to build an automatic learning system for Mandarin prosody that allows us to make quantitative measurements of prosodic strengths. Stem-ML is a phenomenological model of the muscle dynamics and planning process that controls the tension of the vocal folds. Because Stem-ML describes the interactions between nearby tones or accents, we were able to use a highly constrained model with only one accent template for each lexical tone category, and a single prosodic strength per word. The model accurately reproduces the intonation of the speaker, capturing 87% of the variance of f0. The result reveals strong alternating metrical patterns in words, and shows that the speaker uses word strength to mark a hierarchy of boundaries.
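
The "variance captured" figure quoted above can be illustrated with a small calculation. The sketch below assumes the figure is the usual fraction of variance explained (one minus the residual sum of squares over the total sum of squares); the abstract does not state the exact formula. The function name `explained_variance` and the f0 values are hypothetical, made up for illustration, and are not taken from the paper.

```python
def explained_variance(observed, predicted):
    """Fraction of the variance in `observed` accounted for by `predicted`.

    Assumption: computed as 1 - SS_res / SS_tot, which may differ from
    the exact measure used by the authors.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("need at least one sample")

    mean_obs = sum(observed) / len(observed)
    # Total variation of the observed f0 around its mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    # Variation left unexplained by the model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    if ss_tot == 0:
        raise ValueError("observed values are constant; variance is zero")
    return 1.0 - ss_res / ss_tot


# Hypothetical f0 samples in Hz (illustrative only, not data from the paper).
observed_f0 = [200.0, 220.0, 180.0, 210.0]
modelled_f0 = [205.0, 215.0, 185.0, 205.0]
print(f"fraction of f0 variance captured: "
      f"{explained_variance(observed_f0, modelled_f0):.3f}")
```

A value of 1.0 means the modelled contour matches the observed one exactly, while 0.0 means it does no better than a flat line at the mean f0.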

Similar resources

A Statistical Model with Hierarchical Structure for Predicting Prosody in a Mandarin Text-to-speech System

In this paper we propose a statistical prosody model with hierarchical structure for a Mandarin Text-to-Speech (TTS) system. There are four levels in our model: syllable level, word level, breath group (prosodic phrase) level, and utterance level. Here “hierarchy” means that each lower level is a subset of a higher level. The prosodic information is first found in each level, and then they are c...

Improving Prosodic Boundaries Prediction for Mandarin Speech Synthesis by Using Enhanced Embedding Feature and Model Fusion Approach

Hierarchical prosody structure generation is an important but challenging component for speech synthesis systems. In this paper, we investigate the use of enhanced embedding (joint learning of character and word embedding (CWE)) features and different model fusion approaches at both character and word level for Mandarin prosodic boundaries prediction. For CWE module, the internal structures of ...

Mandarin Text-to-speech Synthesis

This chapter introduces Mandarin Text-To-Speech (MTTS) synthesis. Beginning with a brief review on the development history of MTTS and attributes of MTTS, three main constituents of the technology are presented: 1) Text processing: word segmentation, disambiguation of polyphones, and analysis of rhythm structure; 2) prosodic processing: features of Mandarin prosody, and prosody prediction, and;...

Relative Importance of Tone and Segments for the Intelligibility of Mandarin and Cantonese

This study aims to establish the relative importance of segmental and word-prosodic properties for the intelligibility of spoken Mandarin and Cantonese. Mandarin has a relatively small inventory of lexical tones (four) while Cantonese has a richer tone inventory (at least seven). Word prosody is normally redundant relative to segmental properties so that word recognition does not crucially depend...

Publication date: 2001